In this work, we attempt to capture patterns of co-occurrence across vowel systems and, at
the same time, to identify the nature of the force leading to the emergence of such patterns.
For this purpose we define a weighted network in which the vowels are the nodes and an edge
between two nodes (read vowels) signifies their co-occurrence likelihood over the vowel
inventories. Through this network we identify communities of vowels, which essentially
reflect their patterns of co-occurrence across languages. We observe that in the assortative
vowel communities the constituent nodes (read vowels) are largely uncorrelated in terms
of their features, and show that these communities are formed based on the principle of
maximal perceptual contrast. However, in the rest of the communities, strong correlations are
reflected among the constituent vowels with respect to their features, indicating that
it is the principle of feature economy that binds them together. We validate the above
observations by proposing a quantitative measure of perceptual contrast as well as feature
economy and subsequently comparing the results obtained from these quantifications
with those obtained when we assume that the vowel inventories had evolved just by chance.

Keywords: Vowels; complex network; community structure; feature entropy. 



1. Introduction 

Linguistic research has documented a wide range of regularities across the sound 
systems of the world's languages [U O [TH [T31 [TTl [T5] . Functional phonologists argue 
that such regularities are the consequences of certain general principles like maximal 
perceptual contrast [12], ease of articulation [21 [H], and ease of learnability [2]. In
the study of vowel systems the optimizing principle, which has a long tradition [9][25]
in linguistics, is maximal perceptual contrast. A number of numerical studies based
on this principle have been reported in the literature [12][13][21]. Of late, there have been
some attempts to explain the vowel systems through multi-agent simulations ^2 and
genetic algorithms [10]; all of these experiments also use the principle of perceptual
contrast for optimization purposes.

An exception to the above trend is a school of linguists [31 [6] who argue that
perceptual contrast-based theories fail to account for certain fundamental aspects such
as the patterns of co-occurrence of vowels based on similar acoustic/articulatory
features observed across the vowel inventories. Instead, they posit that the observed
patterns, especially those found in larger size inventories 0, can be explained only
through the principle of feature economy [7][16]. According to this principle, languages
tend to maximize the combinatorial possibilities of a few distinctive features
to generate a large number of sounds.

The aforementioned ideas can possibly be linked together through the example
illustrated by Figure 1. As shown in the figure, the initial plane P consists of a set
of three very frequently occurring vowels /i/, /a/ and /u/, which usually make up
the smaller inventories and do not have any single feature in common. Thus, smaller
inventories are quite likely to have vowels that exhibit a large extent of contrast in
their constituent features. However, in bigger inventories, members from the higher
planes (P' and P'') are also present and they in turn exhibit feature economy. For
instance, in the plane P' comprising the set of vowels /ĩ/, /ã/, /ũ/, we find
a nasal modification applied equally on all three members of the set. This is
indicative of an economic behavior that the larger inventories show while
choosing a new feature in order to reduce the learnability effort of the speakers.
The third plane P'' reinforces this idea by showing that the larger the size of the
inventories, the greater is the urge for this economy in the choice of new features.
Another interesting facet of the figure is the set of relations that exist across the planes
(indicated by the broken lines). All these relations are representative of a common

Maximal perceptual contrast is desirable between the phonemes of a language for proper
perception of each individual phoneme in a noisy environment.

Ease of articulation requires that the sound systems of all languages are formed of certain
universal (and highly frequent) sounds.

Ease of learnability is required so that a speaker can learn the sounds of a language with minimum
effort.

In linguistics, features are the elements that distinguish one phoneme from another. The
features that describe the vowels can be broadly categorized into three different classes, namely the
height, the backness and the roundedness. Height refers to the vertical position of the tongue
ative to either the roof of the mouth or the aperture of the jaw. Backness refers to the horizontal 
tongue position during the articulation of a vowel relative to the back of the mouth. Rounded- 
ness refers to whether the lips are rounded or not during the articulation of a vowel. There are 
however still more possible features of vowel quality, such as the velum position (e.g., nasality), 
type of vocal fold vibration (i.e., phonation), and tongue root position (i.e., secondary place of 
articulation). 






Rediscovering the Co-occurence Principles of Vowel Inventories 3 




Fig. 1. The organizational principles of the vowels (in decreasing frequency of occurrence) indicated 
through different hypothetical planes. 



linguistic concept of robustness [B] in which one frequently occurring vowel (say /i/)
implies the presence of the other (and not vice versa) less frequently occurring vowel
(say /ĩ/) in a language inventory. These cross-planar relations are also indicative
of feature economy since all the features present in the frequent vowel (e.g., /i/)
are also shared by the less frequent one (e.g., /ĩ/). In summary, while the basis
of organization of the vowel inventories is perceptual contrast, as indicated by the
plane P in Figure 1, economic modifications of the perceptually distinct vowels
take place with the increase in the inventory size (as indicated by the planes P'
and P'' in Figure 1).

In this work we attempt to corroborate the above conjecture by automatically
capturing the patterns of co-occurrence that are prevalent in and across the planes
illustrated in Figure 1. We also present a quantitative measure of the driving forces
that lead to the emergence of such patterns and show that the real inventories are
significantly better in terms of this measure than expected. In order to do so, we
define the "Vowel-Vowel Network" or VoNet, which is a weighted network where
the vowels are the nodes and an edge between two nodes (read vowels) signifies
their co-occurrence likelihood over the vowel inventories. We conduct community
structure analysis of different versions of VoNet in order to capture the patterns of
co-occurrence in and across the planes P, P' and P'' shown in Figure 1. The plane P
consists of the communities that are formed of those vowels that have a very high
frequency of occurrence (usually assortative [19] in nature). We observe that the
constituent nodes (read vowels) of these assortative vowel communities are largely
uncorrelated in terms of their features and quantitatively show that they indeed
exhibit a higher than expected level of perceptual contrast. On the other hand,
the communities obtained from VoNet in which the links between the assortative
nodes are absent correspond to the co-occurrence patterns of the planes P' and
P''. In these communities, strong correlations are reflected among the constituent
vowels with respect to their features, and they indeed display a significantly better
feature economy than would be expected by random chance. Moreover, the
co-occurrences across the planes can be captured by the community analysis of VoNet
where only the connections between the assortative and the non-assortative nodes,
with the non-assortative node co-occurring very frequently with the assortative one,
are retained while the rest of the connections are filtered out. We also show that
these communities again exhibit a significantly higher feature economy than feasible
by chance.

This article is organized as follows: Section 2 describes the experimental setup used
to explore the co-occurrence principles of the vowel inventories. In this section
we formally define VoNet, outline its construction procedure, present a
community-finding algorithm, and also present a quantitative definition for maximal
perceptual contrast as well as feature economy. In Section 3 we report the experiments
performed to obtain the community structures, which are representative of the
co-occurrence patterns in and across the planes discussed above. We also report results
where we measure the driving forces that lead to the emergence of such patterns
and show that the real inventories are substantially better in terms of this measure
than those where the inventories are assumed to have evolved by chance. Finally,
we conclude in Section 4 by summarizing our contributions, pointing out some of
the implications of the current work and indicating possible future directions.

2. Experimental Setup 

In this section we systematically develop the experimental setup in order to investigate
the co-occurrence principles of the vowel inventories. For this purpose, we
formally define VoNet, outline its construction procedure, describe a
community-finding algorithm to decompose VoNet to obtain the community structures, and
define the metrics required in order to explore the co-occurrence principles of the
observed communities.

Fig. 2. A partial illustration of the nodes and edges in VoNet. The labels of the nodes denote the
vowels represented in IPA (International Phonetic Alphabet). The numerical values against the
edges and nodes represent their corresponding weights. For example, /i/ occurs in 393 languages;
/e/ occurs in 124 languages, while they co-occur in 117 languages.

2.1. Definition and Construction of VoNet 

Definition of VoNet: We define VoNet as a network of vowels, represented as
$G = (V_V, E)$, where $V_V$ is the set of nodes labeled by the vowels and $E$ is the set of
edges occurring in VoNet. There is an edge $e \in E$ between two nodes if and only if
there exists one or more language(s) where the nodes (read vowels) co-occur. The
weight of the edge $e$ (also edge-weight) is the number of languages in which the
vowels connected by $e$ co-occur. The weight of a node $u$ (also node-weight) is the
number of languages in which the vowel represented by $u$ occurs. In other words,
if a vowel $v_i$ represented by the node $u$ occurs in the inventory of $n$ languages then
the node-weight of $u$ is assigned the value $n$. Also, if the vowel $v_j$ is represented
by the node $v$ and there are $w$ languages in which vowels $v_i$ and $v_j$ occur together,
then the weight of the edge connecting $u$ and $v$ is assigned the value $w$. Figure 2
illustrates this structure by reproducing some of the nodes and edges of VoNet.
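
The definition above can be illustrated with a minimal Python sketch. The function name
`build_vonet`, the representation of an inventory as a list of vowel symbols, and the three
toy inventories are assumptions made only for illustration; they are not UPSID data.

```python
from collections import Counter
from itertools import combinations


def build_vonet(inventories):
    """Return (node_weight, edge_weight) for a list of vowel inventories.

    node_weight[v]      = number of inventories containing vowel v
    edge_weight[(u, v)] = number of inventories containing both u and v,
                          with the pair stored in sorted order
    """
    node_weight = Counter()
    edge_weight = Counter()
    for inventory in inventories:
        vowels = sorted(set(inventory))  # ignore accidental duplicates
        node_weight.update(vowels)
        # Every unordered pair co-occurring in this language adds 1.
        edge_weight.update(combinations(vowels, 2))
    return node_weight, edge_weight


# Hypothetical toy inventories (not taken from UPSID).
toy = [["i", "a", "u"], ["i", "a", "u", "e", "o"], ["i", "a"]]
nodes, edges = build_vonet(toy)
```

In this toy case /i/ and /a/ each receive node-weight 3 and the edge between them receives
edge-weight 3, mirroring the way node- and edge-weights are read off in Figure 2.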

Construction of VoNet: Many typological studies [3 [HI [ill [H [H HH] of
segmental inventories have been carried out in the past on the UCLA Phonological
Segment Inventory Database (UPSID) [15]. Currently UPSID records the sound
inventories of 451 languages covering all the major language families of the world.
In this work we have therefore used UPSID, comprising these 451 languages and
180 vowels found across them, for constructing VoNet. Consequently, the set $V_V$
comprises 180 elements (nodes) and the set $E$ comprises 3135 elements (edges).
Figure 3 presents a partial illustration of VoNet as constructed from UPSID.

Fig. 3. A partial illustration of VoNet. All edges in this figure have an edge-weight greater than
or equal to 15. The number on each node corresponds to a particular vowel. For instance, node
number 72 corresponds to /i/.



2.2. Finding Community Structures 

We attempt to identify the communities appearing in VoNet by the extended Radicchi
et al. [20] algorithm for weighted networks as introduced by us in an earlier
article [17]. The basic idea is that if the weights on the edges forming a triangle
(a loop of length three) are comparable, then the group of vowels represented by this
triangle frequently occur together, rendering a pattern of co-occurrence, while if these
weights are not comparable then there is no such pattern. In order to capture this
property we define a strength metric $S$ for each of the edges of VoNet as follows.
Let the weight of the edge $(u, v)$, where $u, v \in V_V$, be denoted by $w_{uv}$. We define
$S$ as

$$S = \frac{w_{uv}}{\sqrt{\sum_{i \in V_V - \{u, v\}} (w_{ui} - w_{vi})^2}} \tag{1}$$

if $\sqrt{\sum_{i \in V_V - \{u, v\}} (w_{ui} - w_{vi})^2} > 0$, else $S = \infty$. The denominator in this expression
essentially tries to capture whether or not the weights on the edges forming triangles
are comparable (the higher the value of $S$, the more comparable the weights are).
The network can then be decomposed into clusters or communities by removing
edges that have $S$ less than a specified threshold (say $\eta$).
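
As a rough illustration of the strength metric and the thresholding step, the following
Python sketch computes $S$ for an edge and then reads communities off as connected
components after weak edges are dropped. Several things here are assumptions rather than
details given in the text: the function names, the storage of edge weights in a dictionary
keyed by `frozenset`, the treatment of an absent edge as weight 0, the use of connected
components to read off the clusters, and the toy weights.

```python
import math


def edge_strength(w, nodes, u, v):
    """Strength S of edge (u, v); absent edges are assumed to have weight 0."""
    def wt(a, b):
        return w.get(frozenset((a, b)), 0)

    denom = math.sqrt(
        sum((wt(u, i) - wt(v, i)) ** 2 for i in nodes if i not in (u, v))
    )
    return wt(u, v) / denom if denom > 0 else math.inf


def communities(w, nodes, eta):
    """Remove edges with S < eta and return the connected components."""
    adj = {n: set() for n in nodes}
    for edge in w:
        u, v = tuple(edge)
        if edge_strength(w, nodes, u, v) >= eta:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:  # depth-first traversal of one component
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(adj[x] - comp)
        seen |= comp
        comps.append(comp)
    return comps


# Hypothetical toy network: a balanced triangle plus one weakly attached node.
toy_nodes = ["i", "a", "u", "y"]
toy_w = {
    frozenset(("i", "a")): 10,
    frozenset(("i", "u")): 10,
    frozenset(("a", "u")): 10,
    frozenset(("i", "y")): 1,
}
comps = communities(toy_w, toy_nodes, eta=1.0)
```

With these toy weights the triangle edges have comparable weights and therefore high
strength, whereas the edge to the fourth node falls below the threshold and is removed.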

At this point it is worthwhile to clarify the significance of a vowel community.
A community of vowels refers to a set of vowels that occur together in
the language inventories very frequently. In other words, there is a higher than
expected probability of finding a vowel v in an inventory that already hosts the
other members of the community to which v belongs. For instance, if /i/, /a/
and /u/ form a vowel community and if /i/ and /a/ are present in any inventory,
then there is a very high chance that the third member /u/ is also present in the
inventory.

2.3. Definition of the Metrics 

Once the communities are obtained through the algorithm discussed earlier the next 
important task is to analyze them so as to capture the binding force that keeps them 
together. For this purpose, we need to have a quantitative measure for perceptual 
contrast as well as feature economy. In order to establish that the above forces really
play a role in the emergence of the communities, we also need to show
that the communities are much better in terms of this measure than they would have
been if the vowel inventories had evolved by chance. In the rest of this section we
detail the metric for quantification as well as the metric for comparison.

2.3.1. Metric for Quantification 

For a community $C$ of size $N$ let there be $p_f$ vowels that have a particular feature $f$
(where $f$ is assumed to be boolean in nature) in common and $q_f$ other vowels that
lack the feature $f$. Thus, the probability that a particular vowel chosen uniformly
at random from $C$ has the feature $f$ is $p_f/N$ and the probability that the vowel lacks
the feature $f$ is $q_f/N$ ($= 1 - p_f/N$). If $F$ is the set of all features present in the vowels in
$C$, then the feature entropy $F_E$ can be defined as

$$F_E = -\sum_{f \in F} \left( \frac{p_f}{N} \log_2 \frac{p_f}{N} + \frac{q_f}{N} \log_2 \frac{q_f}{N} \right) \tag{2}$$

$F_E$ is essentially the measure of the number of bits that are required to communicate
the information about the entire community $C$ through a channel.
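
A small Python sketch may make the computation of feature entropy concrete. It assumes
that each vowel is described by the set of boolean features it possesses, that the logarithm
is taken to base 2 (in line with the interpretation in bits), and that a term with zero
probability contributes nothing. The function name and the two toy communities, including
their feature labels, are hypothetical.

```python
import math


def feature_entropy(community):
    """Feature entropy of a community given as {vowel: set_of_features}."""
    n = len(community)
    features = set().union(*community.values())  # the set F
    total = 0.0
    for f in features:
        p = sum(1 for feats in community.values() if f in feats) / n  # p_f / N
        for prob in (p, 1 - p):  # p_f / N and q_f / N
            if prob > 0:         # treat 0 * log(0) as 0
                total -= prob * math.log2(prob)
    return total


# Hypothetical communities: one contrastive, one sharing most features.
contrastive = {
    "i": {"high", "front", "unrounded"},
    "a": {"low", "central", "unrounded"},
    "u": {"high", "back", "rounded"},
}
economical = {
    "i": {"high", "front", "unrounded"},
    "i_nasal": {"high", "front", "unrounded", "nasalized"},
    "i_long": {"high", "front", "unrounded", "long"},
}
fe_contrastive = feature_entropy(contrastive)
fe_economical = feature_entropy(economical)
```

In this sketch a feature shared by every member of the community contributes zero entropy,
so the second toy community, whose members differ in only two features, scores lower than
the first.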

Capturing Perceptual Contrast: If $C$ comprises a set of perceptually distinct
vowels, then a larger number of bits should be required to communicate the
information about $C$ over the transmission channel, since in this case the set of
features that constitute the vowels is larger. Therefore, the higher the
perceptual contrast, the higher the feature entropy. The idea is illustrated through
the example in Figure 4. In the figure, $F_E$ exhibited by the community $C_1$ is higher
than that of the community $C_2$, since the vowels in $C_1$ are perceptually more
distinct than those in $C_2$.

There are 28 such boolean features that are found across the vowel systems recorded in UPSID.

Fig. 4. $F_E$ for the two different communities $C_1$ and $C_2$. The letters h, f, b, r, u, l, and n stand
for the features high, front, back, rounded, unrounded, long, and nasalized respectively.

Capturing Feature Economy: To convey more information using
fewer bits, the combinatorial possibilities of the features
used by the constituent vowels in the community $C$ must be maximized, which is precisely the
prediction made by the principle of feature economy. Therefore, the lower the feature
entropy, the higher the feature economy. It is for this reason that in
Figure 5, $F_E$ exhibited by the community $C_1$ is lower than that of the community
$C_2$, since in $C_1$ the combinatorial possibilities of the features are better utilized by
the vowels than in $C_2$.

2.3.2. Metric for Comparison 

For the purpose of the comparison discussed earlier, we construct a random
version of VoNet, namely VoNetrand. Let the frequency of occurrence of each vowel
$v$ in UPSID be denoted by $f_v$. Let there be 451 bins, each corresponding to a language
in UPSID. $f_v$ bins are then chosen uniformly at random and the vowel $v$ is packed
into these bins. Thus the vowel inventories of the 451 languages corresponding to
the bins are generated. In such randomly constructed inventories the effect of neither
force (perceptual contrast or feature economy) should be prevalent, as there
is no strict co-occurrence principle that plays a role in the inventory construction.
Therefore these inventories should show a feature entropy no better than expected
by random chance and hence can act as a baseline for all our experiments reported
in the following section. VoNetrand can then be constructed from these new vowel
inventories in the same way as VoNet. The method for the construction is summarized in
Algorithm 1.

Fig. 5. $F_E$ for the two different communities $C_1$ and $C_2$. The letters h, f, b, r, u, l, and n stand
for the features high, front, back, rounded, unrounded, long, and nasalized respectively.

Algorithm 1. Algorithm to construct VoNetrand

for each vowel v
{
    for i = 1 to f_v
    {
        Choose one of the 451 bins, corresponding to the languages in UPSID,
        uniformly at random;
        Pack the vowel v into the bin so chosen if it has not been already
        packed into this bin earlier;
    }
}
Construct VoNetrand, similarly as VoNet, from the new vowel inventories (each bin
corresponds to a new inventory);
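
The bin-packing loop of Algorithm 1 could be sketched in Python as follows. The function
name, the `seed` parameter (added only so that runs are repeatable) and the use of a
dictionary mapping each vowel to its frequency $f_v$ are assumptions; the two frequencies in
the usage lines are the ones quoted in the caption of Fig. 2.

```python
import random


def random_inventories(freq, n_langs=451, seed=0):
    """Generate random vowel inventories following the loop of Algorithm 1.

    freq maps each vowel to its frequency of occurrence f_v.
    Returns a list of n_langs sets, one per bin (language).
    """
    rng = random.Random(seed)
    bins = [set() for _ in range(n_langs)]
    for vowel, f_v in freq.items():
        for _ in range(f_v):
            # Choose a bin uniformly at random; adding to a set is a no-op
            # when the vowel has already been packed into that bin.
            bins[rng.randrange(n_langs)].add(vowel)
    return bins


# Frequencies of /i/ and /e/ as quoted in the caption of Fig. 2.
bins = random_inventories({"i": 393, "e": 124})
```

Because a bin may be drawn more than once for the same vowel, the number of bins that end
up containing a vowel can be smaller than $f_v$ under this reading of the algorithm. The
resulting bins can then be turned into a network in the same way as the real inventories.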

3. Experiments and Results 

In this section we describe the experiments performed and the results obtained from
the analysis of VoNet. In order to find the co-occurrence patterns in and across the
planes of Figure 1 we define three versions of VoNet, namely VoNetassort, VoNetrest
and VoNetrest'. The construction procedure for each of these versions is presented
below.

Fig. 6. The frequency (y-axis) versus rank (x-axis) curve in log-log scale illustrating the distribution
of the occurrence of the vowels over the language inventories of UPSID.

Construction of VoNetassort: VoNetassort comprises the assortative nodes
having node-weights above 120 (i.e., vowels occurring in more than 120 languages
in UPSID), along with only the edges inter-connecting these nodes. The rest of the
nodes (having node-weight less than 120) and edges are removed from the network.
We choose this node-weight for separating the assortative nodes from the
non-assortative ones by observing the distribution of the occurrence frequency of
the vowels illustrated in Figure 6. The curve shows the frequency of a vowel (y-axis)
versus the rank of the vowel according to this frequency (x-axis) in log-log scale. The
high-frequency zone (marked by a circle in the figure) can be easily distinguished
from the low-frequency one since there is a distinct gap between the two in
the curve.

Figure 7 illustrates how VoNetassort is constructed from VoNet. Presently, the
number of nodes in VoNetassort is 9 and the number of edges is 36.

The term "assortative node" here refers to the nodes having a very high node-weight.

Fig. 7. The construction procedure of VoNetassort from VoNet.

Construction of VoNetrest: VoNetrest comprises all the nodes of
VoNet. It also has all the edges of VoNet except for those edges that inter-connect
the assortative nodes. Figure 8 shows how VoNetrest can be constructed from VoNet.
The number of nodes and edges in VoNetrest are 180 and 129^ respectively. 

Construction of VoNetrest': VoNetrest' again comprises all the nodes of
VoNet. It consists of only the edges that connect an assortative node with a
non-assortative one if the non-assortative node co-occurs more than ninety-five
percent of the time with the assortative nodes. The basic idea behind such a construction
is to capture the co-occurrence patterns based on robustness [6] (discussed earlier in
the introductory section) that actually defines the cross-planar relationships in
Figure 1. Figure 9 shows how VoNetrest' can be constructed from VoNet. The number
of nodes in VoNetrest' is 180 while the number of edges is 

We separately apply the community-finding algorithm (discussed earlier) on each
of VoNetassort, VoNetrest and VoNetrest' in order to obtain the respective vowel
communities. We can obtain different sets of communities by varying the threshold
$\eta$. A few assortative vowel communities (obtained from VoNetassort) are noted in
Table 1. Some of the communities obtained from VoNetrest are presented in Table 2.
We also note some of the communities obtained from VoNetrest' in Table 3.
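
The three edge sets described in the construction paragraphs could be derived from the
node- and edge-weights roughly as in the Python sketch below. The cutoff of 120 comes from
the text; everything else is an assumption made for illustration, in particular the reading
of the ninety-five percent condition as "edge-weight divided by the node-weight of the
non-assortative vowel exceeds 0.95", the function name, and the toy weights, which are not
UPSID figures.

```python
def split_vonet(node_weight, edge_weight, cutoff=120, ratio=0.95):
    """Split the edges of VoNet into the assort, rest and rest' edge sets.

    node_weight: {vowel: number of languages}
    edge_weight: {(u, v): number of languages in which u and v co-occur}
    """
    assort = {v for v, w in node_weight.items() if w > cutoff}
    e_assort, e_rest, e_rest_prime = {}, {}, {}
    for (u, v), w in edge_weight.items():
        u_in, v_in = u in assort, v in assort
        if u_in and v_in:
            e_assort[(u, v)] = w      # edge between two assortative nodes
            continue
        e_rest[(u, v)] = w            # every other edge stays in rest
        if u_in != v_in:              # exactly one endpoint is assortative
            rare = v if u_in else u
            # Assumed reading of the ninety-five percent condition.
            if w / node_weight[rare] > ratio:
                e_rest_prime[(u, v)] = w
    return assort, e_assort, e_rest, e_rest_prime


# Hypothetical toy weights (not UPSID figures).
toy_nodes = {"i": 393, "a": 392, "i_nasal": 50, "y": 20}
toy_edges = {
    ("i", "a"): 380,
    ("i", "i_nasal"): 49,
    ("i", "y"): 10,
    ("i_nasal", "y"): 5,
}
assort, e_assort, e_rest, e_rest_prime = split_vonet(toy_nodes, toy_edges)
```

Each of the three edge sets can then be passed separately to a community-finding routine,
which keeps the three analyses independent of one another.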



We have neglected nodes with node-weight less than 3 since these nodes correspond to vowels that
occur in less than 3 languages in UPSID and the communities they form are therefore statistically
insignificant.

The network does not get disconnected due to this construction since there is always a small
fraction of edges that run between assortative and low node-weight non-assortative nodes of otherwise
disjoint groups.

Fig. 8. The construction procedure of VoNetrest from VoNet.

Fig. 9. The construction procedure of VoNetrest' from VoNet.



Tables 1, 2 and 3 indicate that the communities in VoNetassort are formed based
on the principle of perceptual contrast whereas the formation of the communities in
VoNetrest as well as VoNetrest' is largely governed by feature economy. We dedicate
the rest of this section mainly to verifying the above argument. For this reason we

Table 1. Assortative vowel communities. The contrastive features separated by slashes (/) are
shown within parentheses. Comma-separated entries represent the features that are in use from
the three respective classes, namely the height, the backness, and the roundedness.

Community        Features in Contrast
/i/, /a/, /u/    (low/high), (front/central/back), (unrounded/rounded)
/e/, /o/         (higher-mid/mid), (front/back), (unrounded/rounded)



Table 2. Some of the vowel communities obtained from VoNetrest.

Community                      Features in Common
/ĩ/, /ã/, /ũ/                  nasalized
/ĩ:/, /ã:/, /ũ:/               long, nasalized
/i:/, /u:/, /a:/, /o:/, /e:/   long



Table 3. Some of the vowel communities obtained from VoNetrest'. Comma-separated entries
represent the features that are in use from the three respective classes, namely the height, the
backness, and the roundedness.

Community    Features in Common
null         high, front, unrounded
/a/, /a/     low, central, unrounded
/u/, /u/     high, back, rounded



present a detailed study of the co-occurrence principles of the communities obtained
from VoNetassort, VoNetrest, and VoNetrest'. In each case we compare the results
with those of VoNetrand obtained from Algorithm 1.

3.1. Co-occurrence Principles of the Communities of VoNetassort 

We apply the community-finding algorithm (discussed earlier) on VoNetrand in order
to obtain the assortative communities similarly as outlined for VoNet. Figure 10
illustrates, for all the communities obtained from the clustering of VoNetassort and
its random version, the average feature entropy exhibited by the communities of a
particular size (y-axis) versus the community size (x-axis).
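
The averaging that underlies such a curve can be sketched in a few lines of Python. The
input format, a list of (community size, feature entropy) pairs collected over all
thresholds, the function name and the toy values are assumptions made for illustration.

```python
from collections import defaultdict


def average_entropy_by_size(size_entropy_pairs):
    """Average feature entropy of communities, grouped by community size."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for size, entropy in size_entropy_pairs:
        sums[size] += entropy
        counts[size] += 1
    # One data point per community size, ordered by size.
    return {k: sums[k] / counts[k] for k in sorted(sums)}


# Hypothetical (size, feature entropy) pairs gathered at several thresholds.
curve = average_entropy_by_size([(3, 6.0), (3, 4.0), (5, 2.5)])
```

Each key of the returned dictionary gives one x-coordinate of the curve and the associated
value gives the corresponding y-coordinate; the same routine applies unchanged to the real
and the random networks.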

Let there be $n$ communities of a particular size $k$ picked up at various thresholds. The average
feature entropy of the communities of size $k$ is therefore $\frac{1}{n} \sum_{i=1}^{n} F_E^i$, where $F_E^i$ signifies the feature
entropy of the $i$-th community.

Fig. 10. Curves showing the average feature entropy of the communities of a particular size versus
the community size for VoNetassort as well as its random counterpart.

A closer inspection of Figure 10 immediately reveals that the feature entropy
exhibited by the communities of VoNetassort is higher than that of the random
version of the same. The two curves finally intersect due to the formation of a
single giant component, which is similar for the real and the random edition of
VoNetassort. Nevertheless, the data points that appear on these curves are rather
few in number and hence Figure 10 alone is not sufficient to establish that
the communities in VoNetassort are formed based on the principle of perceptual
contrast. Another possible way to investigate the problem is to look into
the co-occurrence principles of the smaller vowel inventories (of size ≤ 4) since they
mostly comprise the members belonging to the assortative vowel communities.
Table 4, for instance, shows the number of occurrences of the members of the
community formed by /i/, /a/, and /u/, as compared to the average occurrence of
other vowels, in the inventories of size 3 and 4. The figures in the table point to
the fact that the smaller inventories can be assumed to be good representatives
of the assortative vowel communities. We therefore compare the average feature
entropy of these inventories as a whole with their random counterparts (obtained
from Algorithm 1). Figure 11 illustrates the result of this comparison. The curves
depict the average feature entropy of the vowel inventories of a particular size
(y-axis) versus the inventory size (x-axis). The two different plots compare the average
feature entropy of the inventories obtained from UPSID with that of the randomly
constructed ones. The figure clearly shows that the average feature entropy of the
vowel inventories of UPSID is substantially higher for inventory sizes 3 and 4 than
that of those constructed randomly.

The results presented in Figures 10 and 11 together confirm that the assortative
vowel communities are formed based on the principle of maximal perceptual
contrast.

Table 4. Frequency of occurrence of the members of the community /i/, /a/, and /u/, as compared
to the frequency of occurrence of other vowels, in smaller inventories. The last column indicates the
average number of times that a vowel other than /i/, /a/, and /u/ occurs in the inventories of
size 3 and 4.

Inv. Size   No. of Invs.   Occ. /i/   Occ. /a/   Occ. /u/   Avg. Occ. other vowels
3           23             15         21         12         3
4           25             19         24         11         3

Fig. 11. Curves showing the average feature entropy of the vowel inventories of a particular size
versus the inventory size. The two different plots compare the average feature entropy of the
inventories obtained from UPSID with that of the randomly constructed ones.

3.2. Co-occurrence Principles of the Communities of VoNetrest 

In this section, we investigate whether or not the communities obtained from
VoNetrest are better in terms of feature entropy than they would have been if
the vowel inventories had evolved just by chance. We construct the random
edition of VoNetrest from VoNetrand and apply the community-finding algorithm on
it so as to obtain the communities. Figure 12 illustrates, for all the communities
obtained from the clustering of VoNetrest and its random version, the average
feature entropy exhibited by the communities of a particular size (y-axis) versus the
community size (x-axis). The curves in the figure make it quite clear that the
average feature entropy exhibited by the communities of VoNetrest is substantially
lower than that of their random counterpart (especially for a community size < 7).
As the community size increases, the difference in the average feature entropy of
the communities of VoNetrest and its random version gradually diminishes. This is
mainly because of the formation of a single giant community, which is similar for
the real and the random versions of VoNetrest.

Fig. 12. Curves showing the average feature entropy of the communities of a particular size versus
the community size for VoNetrest as well as its random counterpart.

The above result indicates that the driving force behind the formation of the
communities of VoNetrest is the principle of feature economy. It is important to
mention here that the larger vowel inventories, which usually comprise the
communities of VoNetrest, also exhibit feature economy to a large extent. This
is reflected in Figure 11, where all the real inventories of size > 5 have a
substantially lower average feature entropy than that of the randomly generated
ones.

3.3. Co-occurrence Principles of the Communities of VoNetrest' 

In this section we compare the feature entropy of the communities obtained from
VoNetrest' with that of its random counterpart (constructed from VoNetrand).
Figure 13 shows the average feature entropy exhibited by the communities of a
particular size (y-axis) versus the community size (x-axis) for both the real and the
random version of VoNetrest'. The curves in the figure make it quite clear that the
average feature entropy exhibited by the communities of VoNetrest' is substantially
lower than that of the random ones. This result immediately reveals that it
is again feature economy that plays a key role in the emergence of the communities
of VoNetrest'.



4. Conclusion 

In this paper we explored the co-occurrence principles of the vowels across the
inventories of the world's languages. In order to do so we started with a concise 
review of the available literature on vowel inventories. We proposed an automatic 
procedure to capture the co-occurrence patterns of the vowels across languages. We 
also discussed the notion of feature entropy, which immediately allows us to validate 
the explanations of the organizational principles of the vowel inventories furnished 
by the earlier researchers. 

Some of our important findings from this work are:

• The smaller vowel inventories (corresponding to the communities of 
VoNetassort) tend to be organized based on the principle of maximal per- 
ceptual contrast; 

• On the other hand, the larger vowel inventories (mainly comprising the
communities of VoNetrest) reflect a considerable extent of feature economy;

• Co-occurrences based on robustness are prevalent across vowel inventories 
(captured through the communities of VoNetrest') and their emergence is 
again a consequence of feature economy. 

Until now, we have mainly emphasized analyzing the co-occurrence principles
of the vowel inventories of the world's languages. An issue that draws attention is
how the forces of perceptual contrast and feature economy have interacted, causing
the emergence of the human vowel systems. One possible way to answer this
question is by having a growth model for the network, where the growth takes place
owing to the optimization of a function (see [4] for a reference) that involves
the above forces and also accounts for the observed regularities displayed by the
vowel inventories. It is worth mentioning here that though most of the
mechanisms of network growth rely on preferential attachment-based rules [1],
there are scenarios which suggest that additional optimizing constraints need to
be imposed on the evolving network so as to match its emergent properties with
empirical data [23][24]. Such a growth model based on some optimization technique
can then shed light on the real dynamics that went on in the evolution of
the vowel inventories. We look forward to developing the same as a part of our future
work.